Disfluency Detection using a Noisy Channel Model and a Deep Neural Language Model
نویسندگان
چکیده
This paper presents a model for disfluency detection in spontaneous speech transcripts called LSTM Noisy Channel Model. The model uses a Noisy Channel Model (NCM) to generate n-best candidate disfluency analyses and a Long Short-Term Memory (LSTM) language model to score the underlying fluent sentences of each analysis. The LSTM language model scores, along with other features, are used in a MaxEnt reranker to identify the most plausible analysis. We show that using an LSTM language model in the reranking process of noisy channel disfluency model improves the state-of-the-art in disfluency detection.
منابع مشابه
The impact of language models and loss functions on repair disfluency detection
Unrehearsed spoken language often contains disfluencies. In order to correctly interpret a spoken utterance, any such disfluencies must be identified and removed or otherwise dealt with. Operating on transcripts of speech which contain disfluencies, we study the effect of language model and loss function on the performance of a linear reranker that rescores the 25-best output of a noisychannel ...
متن کاملMelanoma detection with a deep learning model
Background: Skin cancer is one of the most common forms of cancer in the world and melanoma is the deadliest type of skin cancer. Both melanoma and melanocytic nevi begin in melanocytes (cells that produce melanin). However, melanocytic nevi are benign whereas melanoma is malignant. This work proposes a deep learning model for classification of these two lesions. Methods: In this analytic s...
متن کاملA comparison between a DNN and a CRF disfluency detection and reconstruction system
We propose to compare between a Deep Neural Network and a Conditional Random Field disfluency detection and reconstruction system, both trained on the same features. Deep Neural Networks, despite an increasing popularity in a multitude of speech and language related tasks, were never applied to disfluency recognition. One of the most difficult classes of disfluency is false starts. We are inter...
متن کاملA Deep Network Model for Paraphrase Detection in Short Text Messages
This paper is concerned with paraphrase detection. The ability to detect similar sentences written in natural language is crucial for several applications, such as text mining, text summarization, plagiarism detection, authorship authentication and question answering. Given two sentences, the objective is to detect whether they are semantically identical. An important insight from this work is ...
متن کاملDetecting Overlapping Communities in Social Networks using Deep Learning
In network analysis, a community is typically considered of as a group of nodes with a great density of edges among themselves and a low density of edges relative to other network parts. Detecting a community structure is important in any network analysis task, especially for revealing patterns between specified nodes. There is a variety of approaches presented in the literature for overlapping...
متن کامل